Context-aware Adversarial Training for Name Regularity Bias in Named Entity Recognition
Abstract
In this work, we examine the ability of NER models to use contextual information when predicting the type of an ambiguous entity. We introduce NRB, a new testbed carefully designed to diagnose Name Regularity Bias of NER models. Our results indicate that all state-of-the-art models we tested show such a bias, with BERT fine-tuned models significantly outperforming feature-based (LSTM-CRF) ones on NRB, despite having comparable (sometimes lower) performance on standard benchmarks. To mitigate this bias, we propose a novel model-agnostic training method that adds learnable adversarial noise to some entity mentions, thus forcing models to focus more strongly on the contextual signal, leading to significant gains on NRB. Combining it with two other training strategies, data augmentation and parameter freezing, leads to further gains.
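The noise-injection idea in the abstract can be pictured with a small sketch. It is not the authors' implementation: the function name `perturb_entity_mentions`, the BIO tag scheme, the per-mention sampling `rate`, and the plain-list embeddings are all assumptions made for illustration. The sketch covers only the step of adding a noise vector to the tokens of some entity mentions while leaving context tokens untouched. The learnable, adversarial part (updating the noise against the tagging loss) is omitted.

```python
import random
from typing import List, Optional, Sequence


def perturb_entity_mentions(
    embeddings: Sequence[Sequence[float]],
    tags: Sequence[str],
    noise: Sequence[float],
    rate: float = 0.5,
    rng: Optional[random.Random] = None,
) -> List[List[float]]:
    """Return a copy of `embeddings` with `noise` added to sampled mentions.

    Hypothetical helper: `tags` are assumed to be BIO labels, and each
    mention is perturbed as a whole with probability `rate`.
    """
    rng = rng or random.Random(0)
    out = [list(vec) for vec in embeddings]
    perturb = False
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            # A new mention starts: decide once per mention whether to perturb it.
            perturb = rng.random() < rate
        elif not tag.startswith("I-"):
            # Outside any mention: context tokens are never perturbed.
            perturb = False
        if perturb:
            out[i] = [x + n for x, n in zip(out[i], noise)]
    return out


# Toy input: a two-token person mention, one context token, one location.
tags = ["B-PER", "I-PER", "O", "B-LOC"]
embeddings = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
noise = [0.5, -0.5]

# rate=1.0 perturbs every mention; the context token stays unchanged.
perturbed = perturb_entity_mentions(embeddings, tags, noise, rate=1.0)
```

In a full training loop the noise would be a trainable parameter rather than a fixed vector, which is the part this sketch leaves out.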
Similar resources
Named Entity Recognition in Persian Text using Deep Learning
Named entity recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
Using Corpus-derived Name Lists for Named Entity Recognition
This paper describes experiments to establish the performance of a named entity recognition system which builds categorized lists of names from manually annotated training data. Names in text are then identified using only these lists. This approach does not perform as well as state-of-the-art named entity recognition systems. However, we then show that by using simple filtering techniques for imp...
Domain-aware Evaluation of Named Entity Recognition Systems for Croatian
We provide an evaluation of the currently available named entity recognition systems for Croatian. The evaluation puts special emphasis on domain dependence. To this goal, we manually annotated a dataset of approximately 1 million tokens of Croatian text from various domains within the newspaper text genre. The dataset was annotated using a three-class named entity tagset – denoting personal na...
Exploiting Dependency Context Gazetteers for Named Entity Recognition
Modern named entity recognition (NER) systems mostly employ a supervised machine learning approach that heavily depends on local contexts. While NER systems based on local contexts provide strong baseline performance, results of recent research have demonstrated that non-local contexts can further improve the performance of these systems. In this paper, we propose the use of a context gazetteer...
An Active Co-Training Algorithm for Biomedical Named-Entity Recognition
Exploiting unlabeled text data with a relatively small labeled corpus has been an active and challenging research topic in text mining, due to the recent growth of the amount of biomedical literature. Biomedical named-entity recognition is an essential prerequisite task before effective text mining of biomedical literature can begin. This paper proposes an Active Co-Training (ACT) algorithm for...
Journal
Journal title: Transactions of the Association for Computational Linguistics
Year: 2021
ISSN: 2307-387X
DOI: https://doi.org/10.1162/tacl_a_00386